Contribution of Previously Unrecognized RNA Splice-Altering Variants to Congenital Heart Disease

Background: Known genetic causes of congenital heart disease (CHD) explain <40% of CHD cases, and interpreting the clinical significance of variants with uncertain functional impact remains challenging. We aim to improve diagnostic classification of variants in patients with CHD by assessing the impact of noncanonical splice region variants on RNA splicing. Methods: We tested de novo variants from trio studies of 2649 CHD probands and their parents, as well as rare (allele frequency, <2×10−6) variants from 4472 CHD probands in the Pediatric Cardiac Genetics Consortium through a combined computational and in vitro approach. Results: We identified 53 de novo and 74 rare variants in CHD cases that alter splicing and thus are loss of function. Of these, 77 variants are in known dominant, recessive, and candidate CHD genes, including KMT2D and RBFOX2. In 1 case, we confirmed the variant’s predicted impact on RNA splicing in RNA transcripts from the proband’s cardiac tissue. Two probands were found to have 2 loss-of-function variants for recessive CHD genes HECTD1 and DYNC2H1. In addition, SpliceAI—a predictive algorithm for altered RNA splicing—has a positive predictive value of ≈93% in our cohort. Conclusions: Through assessment of RNA splicing, we identified a new loss-of-function variant within a CHD gene in 78 probands, of whom 69 (1.5%; n=4472) did not have a previously established genetic explanation for CHD. Identification of splice-altering variants improves diagnostic classification and genetic diagnoses for CHD. Registration: URL: https://clinicaltrials.gov; Unique identifier: NCT01196182.

inclusive of coding variants, potential donor gain (DG) variants had a calculated ΔMaxEnt > 0 and MaxEnt VAR > 4.1, and potential acceptor gain (AG) variants had a calculated MaxEnt VAR > 7.1. We assessed DG and AG variants only within genes with high heart expression (HHE), defined as the top quartile of expression from RNA sequencing of mouse heart at embryonic day 14.5 23.
For rare variant analyses, we considered only single-nucleotide variants (SNVs) within the 5'ss and 3'ss regions. Because 32,695 PCGC and 664,697 gnomAD rare variants were identified, we studied only variants that lie within a set of 253 CHD genes (Supplemental Table III) 7 and are predicted to cause donor loss or acceptor loss (ΔMaxEnt < 0 with Regress_Score.v.095.R), as our earlier functional assays 11 confirmed these more consistently than DG and AG variants.
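The MaxEnt cutoffs described above can be written as a small filter. This is a minimal sketch, not the authors' Regress_Score.v.095.R script: the function names are hypothetical, and it assumes ΔMaxEnt is the variant score minus the reference score, which the text does not state explicitly.

```python
# Thresholds quoted in the text; all function and argument names are hypothetical.
DG_MIN_MAXENT_VAR = 4.1  # donor gain: MaxEnt VAR must exceed this
AG_MIN_MAXENT_VAR = 7.1  # acceptor gain: MaxEnt VAR must exceed this


def delta_maxent(maxent_ref: float, maxent_var: float) -> float:
    """Assumed definition: variant (VAR) score minus reference (REF) score."""
    return maxent_var - maxent_ref


def is_predicted_gain(site: str, maxent_ref: float, maxent_var: float) -> bool:
    """Apply the donor gain (DG) or acceptor gain (AG) cutoffs from the text."""
    if site == "donor":
        # DG requires both a positive change and a VAR score above 4.1.
        return (delta_maxent(maxent_ref, maxent_var) > 0
                and maxent_var > DG_MIN_MAXENT_VAR)
    if site == "acceptor":
        # The text gives only a VAR cutoff for AG.
        return maxent_var > AG_MIN_MAXENT_VAR
    raise ValueError("site must be 'donor' or 'acceptor'")


def is_predicted_loss(maxent_ref: float, maxent_var: float) -> bool:
    """Donor loss or acceptor loss: the score drops at an existing site."""
    return delta_maxent(maxent_ref, maxent_var) < 0
```

In this sketch the gain check is meant for candidate new sites and the loss check for existing splice sites; applying the high heart expression and 253-gene restrictions would be separate filtering steps.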

Minigene design, synthesis, and construction
Using the Construct_design.R tool 11, we designed paired minigene constructs (≤ 500 bp) containing the variant (ALT) or reference (REF) splice sequence, an exon with the donor site, a truncated intron, an exon containing the acceptor site, and a 2-bp barcode sequence to allow for multiplexing. After synthesis (Integrated DNA Technologies or TWIST Biosciences), minigenes were PCR-amplified and purified, and both a CMV promoter and poly-A tail were added 10, resulting in constructs ~1,200 bp in size (Supplemental Figure I).

In vitro assay using HEK cells and RNA sequencing
Pooled minigene constructs (n = 20; 100 ng) were transfected into HEK293T cells using Lipofectamine 2000 (Thermo Fisher), concurrently with the pMaxGFP plasmid to confirm successful transfection. After 24 hours, RNA was collected in TRIzol (Thermo Fisher) and isolated using phenol-chloroform extraction, and its quality was assessed by calculating the RNA integrity number equivalent (RIN) using an Agilent TapeStation 3000. Samples with RIN > 9.0 were used to construct cDNA libraries using SuperScript III (Thermo Fisher) and prepared for sequencing on the Illumina MiSeq platform.

Sequencing analysis and burden analysis
Sequences were processed to trim adapters, and read pairs were merged using FLASH (https://ccb.jhu.edu/software/FLASH/) with previously published scripts (located at https://GitHub.com/SplicingVariant/SplicingVariants_Beta). We assessed splicing only when sequence data contained more than 100 reads for both REF and ALT constructs and REF constructs showed more than 10% normal splicing. Variants within constructs that did not meet these criteria were considered "indeterminate," while others were classified as normal splice, no splice, or aberrant splice. Each splice outcome was normalized to 100, and ratios were calculated for aberrant splicing vs. normal splicing. We calculated P-values using Fisher's exact test to compare ratios of REF and ALT constructs and considered P < 0.05 as significant.
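The read-count criteria and the REF vs. ALT comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the published scripts: the function names and the dictionary layout are hypothetical, the test is assumed to be two-sided, and it is assumed to be applied to a 2×2 table of normal and aberrant counts after normalizing each construct to 100.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def assess_construct(ref: dict, alt: dict, alpha: float = 0.05):
    """ref/alt map 'normal', 'aberrant', 'none' to read counts (hypothetical layout)."""
    ref_total, alt_total = sum(ref.values()), sum(alt.values())
    # Criteria from the text: >100 reads for both constructs, >10% normal REF splicing.
    if ref_total <= 100 or alt_total <= 100:
        return "indeterminate", None
    if ref["normal"] / ref_total <= 0.10:
        return "indeterminate", None
    # Assumption: each construct is normalized to 100 before testing.
    ref_pct = {k: round(100 * v / ref_total) for k, v in ref.items()}
    alt_pct = {k: round(100 * v / alt_total) for k, v in alt.items()}
    p = fisher_exact_two_sided(ref_pct["normal"], ref_pct["aberrant"],
                               alt_pct["normal"], alt_pct["aberrant"])
    return ("significant" if p < alpha else "not significant"), p
```

For example, `assess_construct({"normal": 900, "aberrant": 50, "none": 50}, {"normal": 100, "aberrant": 800, "none": 100})` passes both criteria and compares the tables 90:5 and 10:80.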
Burden analyses for splice variants in CHD cohorts compared with control cohorts were performed by calculating the number of variants per individual in each cohort and comparing these ratios using a right-tailed binomial test.
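One plausible formulation of such a burden comparison is sketched below. The text does not specify how the binomial test was parameterized, so this is an assumption: under the null hypothesis of equal per-individual rates, each observed variant falls in the case cohort with probability equal to the case share of all individuals, and the right tail gives the probability of seeing at least the observed number of case variants. The function names are hypothetical.

```python
from math import exp, lgamma, log


def _log_binom_pmf(i: int, n: int, p: float) -> float:
    """Log of the binomial probability mass, computed in log space for stability."""
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p))


def binomial_right_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    if k <= 0:
        return 1.0
    return min(1.0, sum(exp(_log_binom_pmf(i, n, p)) for i in range(k, n + 1)))


def burden_p_value(case_variants: int, case_n: int,
                   control_variants: int, control_n: int) -> float:
    """Right-tailed test for an excess of variants per individual in cases."""
    total_variants = case_variants + control_variants
    # Assumed null: variants are distributed in proportion to cohort size.
    p_case = case_n / (case_n + control_n)
    return binomial_right_tail(case_variants, total_variants, p_case)
```

With equal cohort sizes, `burden_p_value(8, 100, 2, 100)` reduces to the probability of at least 8 successes in 10 fair trials.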

RNAseq of cardiac tissue
RNA was obtained from discarded pulmonary artery and right ventricular tissues collected during surgery from CHD probands. CHD probands were recruited from two centers into the Congenital Heart Disease Genetic Network Study of the Pediatric Cardiac Genomics Consortium (CHD Genes: NCT01196182). The protocol was approved by the Institutional Review Boards of Boston Children's Hospital, Brigham and Women's Hospital, Great Ormond St Hospital, Children's Hospital of Los Angeles, and Yale School of Medicine. Written informed consent was obtained from each participating subject or parent/guardian. RNA was extracted using the phenol-chloroform method and solubilized in nuclease-free water. RNA (RNA integrity number (RIN) > 8.0) was quantified (Agilent TapeStation 2200), and RT-PCR products were prepared (using a Nextera kit) for sequencing on an Illumina HiSeq or NextSeq. Each RNAseq library yielded 20-30 million reads.

Computational Prioritization of Variants using SpliceAI
We also used the SpliceAI algorithm, applying score thresholds reported to yield high recall (0.2) and high precision (0.8) 8, to assess rare variants in CHD probands and gnomAD controls. We assessed the positive predictive value (PPV) and negative predictive value (NPV) of SpliceAI in CHD samples studied by splice assays and classified variants as true positives and true negatives when SpliceAI predictions and minigene assays agreed. False positives denote variants predicted by SpliceAI to cause abnormal splicing that were not confirmed by minigene assays. Conversely, false negatives denote variants predicted by SpliceAI to have no effect that nonetheless altered splicing in the minigene assay.
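The PPV and NPV definitions above amount to a confusion-matrix tally against the minigene result. The following is a minimal sketch with hypothetical names; it assumes a variant counts as predicted splice-altering when its SpliceAI score is at or above the chosen threshold, which the text does not specify.

```python
HIGH_RECALL_THRESHOLD = 0.2     # threshold quoted in the text
HIGH_PRECISION_THRESHOLD = 0.8  # threshold quoted in the text


def predictive_values(records, threshold):
    """records: iterable of (spliceai_score, assay_altered) pairs (hypothetical layout).

    assay_altered is True when the minigene assay showed altered splicing.
    """
    tp = fp = tn = fn = 0
    for score, assay_altered in records:
        predicted_altered = score >= threshold  # assumed: inclusive cutoff
        if predicted_altered and assay_altered:
            tp += 1  # prediction and assay agree: altered
        elif predicted_altered:
            fp += 1  # predicted altered, not confirmed by assay
        elif assay_altered:
            fn += 1  # predicted no effect, assay showed altered splicing
        else:
            tn += 1  # prediction and assay agree: no effect
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn, "ppv": ppv, "npv": npv}
```

Running the same records through both thresholds shows the expected trade-off: the lower cutoff calls more variants positive, which tends to raise NPV at the cost of PPV.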